Equilibrium Policy Gradients for Spatiotemporal Planning
Abstract
In spatiotemporal planning, agents choose actions at multiple locations in space over a planning horizon to maximize their utility while satisfying various constraints. In forestry planning, for example, the problem is to choose actions for thousands of locations in the forest each year. The actions at each location could include harvesting trees, treating trees against disease and pests, or doing nothing. A utility model could place value on the sale of forest products, ecosystem sustainability, or employment levels, and could incorporate legal and logistical constraints such as avoiding large contiguous areas of clearcutting and managing road access. Planning requires a model of the dynamics. Existing simulators developed by forestry researchers can provide detailed models of the dynamics of a forest over time, but these simulators are often not designed for use in automated planning. This thesis presents spatiotemporal planning in terms of factored Markov decision processes. A policy gradient planning algorithm optimizes a stochastic spatial policy using existing simulators for dynamics. When a planning problem includes spatial interaction between locations, deciding on an action to carry out at one location requires considering the actions performed at other locations. This spatial interdependence is common in forestry and other environmental planning problems and makes policy representation and planning challenging. We define a spatial policy in terms of local policies, each a distribution over actions at one location conditioned on the actions at other locations. A policy gradient planning algorithm using this spatial policy is presented, which uses Markov chain Monte Carlo simulation to sample the
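The idea of a spatial policy built from local conditional policies can be illustrated with a small sketch. The code below is not the thesis's algorithm; it is a minimal Gibbs sampler under assumptions chosen purely for illustration: a rectangular grid of locations, binary actions (0 = do nothing, 1 = harvest), a 4-connected neighbourhood, and a hypothetical logistic local policy with two parameters, `bias` and `coupling`. All function names are invented for this example.

```python
import math
import random


def neighbors(i, j, rows, cols):
    """Return the 4-connected neighbours of cell (i, j) that lie inside the grid."""
    cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(r, c) for r, c in cand if 0 <= r < rows and 0 <= c < cols]


def local_policy(theta, n_active):
    """P(action = 1 | neighbour actions), logistic in the number of active neighbours.

    theta = (bias, coupling) is an assumed parameterization, not the thesis's.
    """
    bias, coupling = theta
    return 1.0 / (1.0 + math.exp(-(bias + coupling * n_active)))


def gibbs_sample(theta, rows, cols, sweeps, rng):
    """Approximately sample a joint action by repeatedly resampling each
    location's action from its local policy given its neighbours' actions."""
    actions = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                n_active = sum(actions[r][c] for r, c in neighbors(i, j, rows, cols))
                p_one = local_policy(theta, n_active)
                actions[i][j] = 1 if rng.random() < p_one else 0
    return actions


rng = random.Random(0)
# A negative coupling discourages harvesting next to already-harvested cells.
joint = gibbs_sample(theta=(0.0, -1.5), rows=4, cols=4, sweeps=50, rng=rng)
for row in joint:
    print(row)
```

With a negative `coupling`, sampled joint actions tend to avoid adjacent harvested cells, which loosely mirrors a constraint such as avoiding large contiguous clearcuts. A policy gradient method would additionally need the gradient of the log-probability with respect to `theta`, which this sketch does not compute.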